ABSTRACT


Title  of  Thesis:  New  Results  in  Discrete-Time  Nonlinear  Filtering 
Name  of  Candidate:  Richard  Bucher  Sowers 
Degree  and  Year:  Master  of  Science,  1988 

Thesis  directed  by:  Armand  Makowski,  Associate  Professor,  Department  of  Electrical 
Engineering,  University  of  Maryland  at  College  Park 

We consider a discrete-time linear system with correlated Gaussian plant and observation noises and a non-Gaussian initial condition independent of the plant and observation noises. We first solve the filtering problem: we find a representation for the conditional distribution of the state at time t given the observations up to time t - 1. This representation is in terms of a finite collection of easily computable statistics. With this solution to the filtering problem, we then find representations for the MMSE and LLSE estimates of the state given the previous observations, and for the mean-square error between the two. (Of course, the MMSE estimate will in general be a nonlinear function of the observations, whereas the LLSE estimate is by definition linear and is given by the Kalman filtering equations.) We then consider the asymptotic behavior of the mean-square error between the MMSE and LLSE estimates as time tends to infinity. We find conditions on the system dynamics under which the effects of the initial condition die out; under these conditions the non-Gaussian nature of the initial condition becomes unimportant as t becomes large. The practical value of this result is clear: under these conditions, the LLSE estimate, which is usually less costly to generate than the MMSE estimate, is asymptotically as good as the MMSE estimate (i.e., asymptotically optimal) in the mean-square sense.
CHAPTER I: INTRODUCTION


I.1. Problem Statement and Outline

We consider the one-step prediction problem associated with the stochastic discrete-time linear dynamical system

$$
\begin{aligned}
X^o_{t+1} &= A_t X^o_t + W^o_{t+1} \\
X^o_0 &= \xi \\
Y_t &= H_t X^o_t + V^o_{t+1}
\end{aligned}
\qquad t = 0, 1, \ldots \qquad (1.1)
$$

which is defined on some underlying probability triple $(\Omega, \mathcal{F}, P)$ carrying the $\mathbb{R}^n$-valued plant process $\{X^o_t\}_0^\infty$ and the $\mathbb{R}^k$-valued observation process $\{Y_t\}_0^\infty$. Throughout, we shall make the following assumptions:

(A.1): the process $\{(W^o_{t+1}, V^o_{t+1})\}_0^\infty$ is a zero-mean Gaussian White Noise (GWN) sequence with covariance structure $\{\Gamma_{t+1}\}_0^\infty$ given by


(A.2): for all t = 0, 1, ..., the covariance matrix $\Sigma_{t+1}$ is positive definite,

(A.3): the initial condition $\xi$ has distribution F with finite first and second moments $\mu$ and $\Delta$ (resp.) and is independent of the process $\{(W^o_{t+1}, V^o_{t+1})\}_0^\infty$, and
(A.4): the covariance matrix $\Delta$ is positive definite.

Note that no a priori assumptions, save those on the first two moments, are imposed on F.
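To make the model (1.1) concrete, the following minimal sketch simulates a scalar, time-invariant instance. Everything named here is an assumption made for illustration only: the scalars `a` and `h`, the noise variances `q` and `r` with cross-covariance `s` (standing in for the entries of the noise covariance), and the two-point initial distribution, which is one simple non-Gaussian choice of F with finite first and second moments.

```python
import random


def simulate(a, h, q, r, s, draw_initial, steps, seed=0):
    """Simulate X_{t+1} = a X_t + W_{t+1}, Y_t = h X_t + V_{t+1} (scalar case).

    (W_{t+1}, V_{t+1}) is zero-mean Gaussian with Var W = q, Var V = r and
    Cov(W, V) = s; we assume q > 0 and q * r - s * s > 0.
    """
    rng = random.Random(seed)
    # Cholesky factor of [[q, s], [s, r]], used to draw the correlated pair.
    l11 = q ** 0.5
    l21 = s / l11
    l22 = (r - l21 ** 2) ** 0.5
    x = draw_initial(rng)  # the initial condition, drawn from F
    xs, ys = [x], []
    for _ in range(steps):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        w = l11 * z1
        v = l21 * z1 + l22 * z2
        ys.append(h * x + v)  # Y_t is observed before the state moves on
        x = a * x + w
        xs.append(x)
    return xs, ys


def two_point(rng):
    """Hypothetical non-Gaussian initial law: +1 or -1 with equal probability."""
    return 1.0 if rng.random() < 0.5 else -1.0
```

A call such as `simulate(0.9, 1.0, 1.0, 1.0, 0.5, two_point, 50)` returns 51 states and 50 observations; the initial draw is independent of the noise sequence, as required by (A.3).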

Define Z as the vector space of all bounded Borel mappings from $\mathbb{R}^n$ into $\mathbb{C}$, the complex numbers. The one-step prediction problem (hereafter referred to simply as the "prediction problem") associated with (1.1) is defined as the problem of computing, for each t = 0, 1, ..., the conditional distribution of the state $X^o_{t+1}$ given the observations $\{Y_s\}_0^t$ or, equivalently, of evaluating, for all t = 0, 1, ... and all $\phi$ in Z, the conditional expectation

$$E[\phi(X^o_{t+1}) \mid Y_s;\ s = 0, 1, \ldots, t]. \qquad (1.3)$$
In this thesis, we shall solve the prediction problem associated with (1.1). For each t = 0, 1, ..., once the conditional distribution of $X^o_{t+1}$ given $\{Y_s\}_0^t$ is available, it is then possible to construct $\hat{X}_{t+1} := E[X^o_{t+1} \mid Y_s;\ s = 0, 1, \ldots, t]$. In general, $\hat{X}_{t+1}$ is a nonlinear function of $\{Y_s\}_0^t$, in contrast to the "Kalman", or LLSE, estimate of $X^o_{t+1}$ on the basis of $\{Y_s\}_0^t$, which is by definition linear, and which we denote by $\hat{X}^K_{t+1}$. We shall find representations for both $\{\hat{X}_t\}_1^\infty$ and $\{\hat{X}^K_t\}_1^\infty$ and then form the mean-square error

$$\epsilon_t := E[\|\hat{X}_t - \hat{X}^K_t\|^2], \qquad t = 1, 2, \ldots$$

Simply stated, $\epsilon_t$ is a measure of the agreement between the MMSE and LLSE estimates of $X^o_t$ on the basis of $\{Y_s\}_0^{t-1}$, for t = 1, 2, .... The final effort of this thesis is to analyze the asymptotic behavior of $\epsilon_t$ as t tends to infinity, that is, the asymptotic mean-square agreement of the true conditional and wide-sense conditional expectations of the state given the observations. This analysis shall focus on the time-invariant version of (1.1), where $A_t = A$, $H_t = H$, and $\Gamma_{t+1} = \Gamma$ for all t = 0, 1, .... Then we can parametrize the asymptotic behavior of $\{\epsilon_t\}_1^\infty$ by the system $(A, H, \Gamma)$ and the initial distribution F. We are particularly interested in triples $(A, H, \Gamma)$ and distributions F for which $\lim_t \epsilon_t = 0$, for then we have the important result that the LLSE estimates $\{\hat{X}^K_t\}_1^\infty$ are asymptotically as good as the MMSE estimates $\{\hat{X}_t\}_1^\infty$; that is, the LLSE estimates are asymptotically optimal in the mean-square sense. The practical value of this is clear: the LLSE estimates are usually less costly to generate than the nonlinear MMSE estimates.
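Since the LLSE estimate is given by the Kalman filtering equations, a small sketch of the one-step predictor in the scalar, time-invariant case may help fix ideas. It uses the standard textbook predictor recursion for noises that are correlated at the same time step; the parameter names `q`, `r`, `s` (plant-noise variance, observation-noise variance and their cross-covariance) and `mean0`, `var0` (the first two moments of the initial condition) are assumptions of this example, not notation taken from the thesis.

```python
def kalman_predictor(a, h, q, r, s, mean0, var0, ys):
    """One-step LLSE predictions of the state and their error variances.

    Model: X_{t+1} = a X_t + W_{t+1}, Y_t = h X_t + V_{t+1}, where
    Var W = q, Var V = r, Cov(W, V) = s. Only the mean and variance of the
    initial condition enter, whatever its distribution.
    """
    x_hat, p = mean0, var0
    preds, variances = [x_hat], [p]
    for y in ys:
        innov_var = h * h * p + r             # variance of the innovation
        gain = (a * p * h + s) / innov_var    # predictor gain, cross term s
        x_hat = a * x_hat + gain * (y - h * x_hat)
        p = a * a * p + q - gain * gain * innov_var  # Riccati update
        preds.append(x_hat)
        variances.append(p)
    return preds, variances
```

The recursion depends on the initial distribution only through its first two moments, which is why the LLSE estimate is cheap to generate compared with the MMSE estimate.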

The thesis is organized as follows. In the remaining sections of Chapter I, we introduce notation to be used in what follows. In Chapter II, we review the discrete-time Girsanov mutually absolutely continuous change of measure, which shall enable us to solve the prediction problem for (1.1). We discuss some aspects of the infinite-horizon Girsanov transformation in the second section of Chapter II. Chapter III develops the discrete-time counterpart of [17] and [19], namely the case where the plant and observation noises are uncorrelated and the observation noise has unit covariance. We call this the "uncorrelated" problem, and the calculations of Chapter III concerning the uncorrelated problem serve primarily as a point of departure for the solution to the more general "correlated" problem, which is found in Chapter IV. By the "correlated" problem, we refer to the case where the plant and observation noises jointly form a GWN sequence with any covariance such that $\Sigma_{t+1}$ is positive definite for all t = 0, 1, ....

Once the solution to the prediction problem associated with (1.1) is known, we turn, in Chapter V, to the task of finding representations for $\{\hat{X}_t\}_1^\infty$, $\{\hat{X}^K_t\}_1^\infty$ and $\{\epsilon_t\}_1^\infty$. Recall that for t = 1, 2, ..., $\hat{X}_t$ is the MMSE estimate (or conditional expectation) of $X^o_t$ on the basis of $(Y_0, Y_1, \ldots, Y_{t-1})$, while $\hat{X}^K_t$ is the LLSE estimate of $X^o_t$ on the basis of $(Y_0, Y_1, \ldots, Y_{t-1})$, and that $\epsilon_t := E[\|\hat{X}_t - \hat{X}^K_t\|^2]$.


Chapter VI is devoted to an analysis of the asymptotic behavior of $\{\epsilon_t\}_1^\infty$ for the time-invariant version of (1.1). In Section 2 of Chapter VI, we use a result of Caines and Mayne to find conditions on $(A, H, \Gamma)$ such that, for any initial distribution F, we have $\lim_t \epsilon_t = 0$, with bounds on the rate of convergence that are also independent of F when F is non-Gaussian (of course, if F is Gaussian, then $\epsilon_t = 0$ for all t = 1, 2, ... and all systems $(A, H, \Gamma)$, since the nonlinear and Kalman estimates coincide). We then further restrict ourselves to the scalar case in Section 3, where the plant and observation processes take values in $\mathbb{R}^n = \mathbb{R}^k = \mathbb{R}$. In the scalar case, we develop a complete characterization of the asymptotics of $\{\epsilon_t\}_1^\infty$ as parametrized by the time-invariant plant $(a, h, \Gamma)$ and the initial distribution F. We shall find that if $(a, h, \Gamma)$ satisfies a generalized version of the criterion presented in Section 2, then $\lim_t \epsilon_t = 0$, with the rate of decay also determined only by the dynamics $(a, h, \Gamma)$. Conversely, we shall find that if $(a, h, \Gamma)$ satisfies a certain instability criterion, then the asymptotic behavior of $\{\epsilon_t\}_1^\infty$ depends nontrivially upon F, to the extent that for some distributions F, $\lim_t \epsilon_t = 0$, while for other distributions F, $\lim_t \epsilon_t > 0$. The significance of these results is the subject of Section 4.
I.2. Background of the Problem


Filtering  theory  is  an  extremely  well-developed  field.  The  Kalman  filtering  equations 
were  first  published  in  1961  [11],  and  in  the  literature  of  the  past  three  decades,  a  vast 
amount  of  theory  has  been  developed.  The  reader  may  wish  to  consult  [10]  for  a  recent 
bibliography  of  filtering  theory. 

The main contributions of this thesis are twofold. Firstly, we extend the work of Makowski [18]-[19] to cover correlated plant and observation noises with a non-Gaussian initial condition. Secondly, we study the asymptotic behavior of $\{\epsilon_t\}_1^\infty$. The filtering or prediction problem for linear systems with a non-Gaussian initial condition and uncorrelated plant and observation noises has been solved in [1], [2], [17]-[19], [21], [22] and [24]. In [19], a linear system with Gaussian initial condition and observation noise, but general non-Gaussian plant noise, was considered. Of course, this problem overlaps the one considered in this thesis, since in [19] the non-Gaussian plant noise may be taken to be the effect of the non-Gaussian initial condition. In [17] and [18], Makowski studied a continuous-time linear system with non-Gaussian initial conditions and uncorrelated Gaussian plant and observation noises. The discrete-time counterpart of these two papers, also analyzed in [14] and [21], is developed in Chapter III of this thesis. Benes and Karatzas, in [2], analyzed, with a control-theoretic orientation, a continuous-time linear system with non-Gaussian initial condition and uncorrelated Gaussian plant and observation noises. In [24], a solution is presented for a generalized filtering problem in which a continuous-parameter state process takes values in some Polish space and is observed through discrete-time observations in $\mathbb{R}^m$. Finally, in [1], [21] and [22], a specific class of non-Gaussian initial distributions is considered, namely, distributions admitting a density with respect to Lebesgue measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ given by a convex combination of non-degenerate Gaussian densities.
I.3. Notation


We  now  define  several  notational  conventions  which  will  simplify  future  presentations. 
We  follow  the  notation  of  [18]. 

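Let $\Phi(\cdot, \cdot)$ be the state transition matrix associated with $\{A_t\}_0^\infty$:

$$
\begin{aligned}
\Phi(t, t) &= I_n \\
\Phi(s+1, t) &= A_s \Phi(s, t),
\end{aligned}
\qquad s = t, t+1, \ldots, \quad t = 0, 1, \ldots \qquad (3.1)
$$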
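The recursion (3.1) simply accumulates left-multiplications by the plant matrices. A minimal sketch follows, with matrices stored as nested lists; the function names and the explicit dimension argument `n` are choices made for this example.

```python
def identity(n):
    """Return the n x n identity matrix as nested lists."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    """Return the matrix product a b for nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transition(A, s, t, n):
    """Return Phi(s, t) = A[s-1] ... A[t] for s >= t, with Phi(t, t) = I_n.

    A is a list of n x n matrices indexed by time, as in (3.1).
    """
    if s < t:
        raise ValueError("Phi(s, t) is only defined here for s >= t")
    phi = identity(n)
    for u in range(t, s):
        phi = mat_mul(A[u], phi)  # Phi(u + 1, t) = A_u Phi(u, t)
    return phi
```

For a time-invariant plant the result reduces to a matrix power, $\Phi(s, t) = A^{s-t}$.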


Similarly,  let  $(•,•)  be  the  state  transition  matrix  described  by 


*(M)  =  In 

*(«  +  !,<)  =  [*•  - 


s  =  M  +  1,... ,  t  =  0,1,...  (3.2) 


For any positive integers n and m, let $M_{n \times m}$ be the space of $n \times m$ real matrices and let $Q_n$ be the cone of $n \times n$ symmetric positive-definite matrices.

Take A in $Q_n$, and let $\|\cdot\|_A$ be the norm on $\mathbb{R}^n$ defined by

$$\|x\|_A := \sqrt{x' A x}, \qquad x \in \mathbb{R}^n. \qquad (3.3)$$

For convenience, define $\|\cdot\| := \|\cdot\|_{I_n}$.

For n = 1, 2, ..., let $\lambda_n$ represent n-dimensional Lebesgue measure on the measurable space $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$.

Finally, let $(\Omega', \mathcal{F}', P')$ be a probability triple such that for every $\Sigma$ in $Q_{2n}$, there are $\mathbb{R}^n$-valued RV's $X_\Sigma$, $B_\Sigma$, and $\xi_\Sigma$, where $(X_\Sigma, B_\Sigma)$ is a zero-mean Gaussian RV with covariance $\Sigma$, and where $\xi_\Sigma$ has distribution F and is independent of $(X_\Sigma, B_\Sigma)$. Let $E'$ be the expectation operator associated with $P'$. Then for every $\phi$ in Z, let the mapping $T\phi : \mathbb{R}^n \times \mathbb{R}^n \times Q_{2n} \to \mathbb{C}$ be defined by

$$T\phi[x, b; \Sigma] := E'[\phi(x + X_\Sigma) \exp[b' B_\Sigma]] \qquad (3.4)$$

and the mapping $U\phi : \mathbb{R}^n \times \mathbb{R}^n \times Q_n \times M_{n \times n} \times Q_{2n} \to \mathbb{C}$ be defined by

U<j>[x,b,A,y-,Y,]  :=  E'[T<j>[x  +  Cs!  S]  exp[6'CE  -  ^Ce^Ce]]-  (3.5) 
CHAPTER II: THE GIRSANOV TRANSFORMATION


II.1. The Finite-Horizon Girsanov Transformation

We here develop the discrete-time, finite-horizon Girsanov mutually absolutely continuous measure transformation, which plays a central role in Chapters III and IV of this thesis (see [8]). The arguments follow [5]. The reader is also referred to [6, Chaps. 2 & 3] for a discussion of predictable and adapted discrete-time processes and discrete-time martingales.

Consider a probability triple $(\Omega, \mathcal{F}, P)$ (not necessarily the same as the one given in Section I.1) and a filtration $\{\mathcal{F}_t\}_0^\infty$ of $\mathcal{F}$. Let $\{V_t\}_1^\infty$ be an $\mathbb{R}^n$-valued $(\mathcal{F}_t)$-zero-mean GWN process with covariance structure $\{\Lambda_t\}_1^\infty$ given by

$$\Lambda_t := \mathrm{Cov}(V_t) = E[V_t V_t'], \qquad t = 1, 2, \ldots \qquad (1.1)$$

and let $\{\chi_t\}_1^\infty$ be an $\mathbb{R}^n$-valued $(\mathcal{F}_t)$-predictable process. Define a third $\mathbb{R}^n$-valued $(\mathcal{F}_t)$-adapted process $\{\bar{V}_t\}_1^\infty$ by

$$\bar{V}_t := V_t - \Lambda_t \chi_t, \qquad t = 1, 2, \ldots \qquad (1.2)$$

Fix T = 0, 1, .... Then the Girsanov transformation provides a measure $\bar{P}$ on $(\Omega, \mathcal{F})$ with the following properties:

(B.1): the measure $\bar{P}$ is a probability measure on $(\Omega, \mathcal{F})$ which is mutually absolutely continuous with P and which agrees with P on $\mathcal{F}_0$, and
(B.2): under $\bar{P}$, the process $\{\bar{V}_t\}_1^T$ is an $(\mathcal{F}_t)$ zero-mean GWN process with covariance structure $\{\Lambda_t\}_1^T$; that is, the statistics of $\{\bar{V}_t\}_1^T$ under $\bar{P}$ are the same as the statistics of $\{V_t\}_1^T$ under P.

Note that in (B.2), we make a statement only about the finite-horizon process $\{\bar{V}_t\}_1^T$, and not the infinite-horizon process $\{\bar{V}_t\}_1^\infty$. This is a finite-horizon measure transformation; under an infinite-horizon measure transformation, the entire process $\{\bar{V}_t\}_1^\infty$ would be a $\bar{P}$-GWN process. We note also that two mutually absolutely continuous measures on the same measurable space are said to be equivalent.
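Rather than directly defining the measure $\bar{P}$, we first consider the $(\mathcal{F}_t)$-adapted process $\{L_t\}_0^\infty$ given by

$$
\begin{aligned}
L_t &:= \prod_{s=1}^{t} \exp[\chi_s' V_s - \tfrac{1}{2} \chi_s' \Lambda_s \chi_s], \qquad t = 1, 2, \ldots \\
L_0 &:= 1;
\end{aligned}
\qquad (1.3)
$$

note that (1.3) may be rewritten as

$$L_t = \prod_{s=1}^{t} \exp[\chi_s' \bar{V}_s + \tfrac{1}{2} \chi_s' \Lambda_s \chi_s], \qquad t = 1, 2, \ldots \qquad (1.4)$$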
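The passage from (1.3) to (1.4) is a one-line substitution. Writing $\bar{V}_s$ for the shifted process of (1.2), so that $V_s = \bar{V}_s + \Lambda_s \chi_s$, each exponent transforms as follows (a worked step added here for convenience):

```latex
\begin{align*}
\chi_s' V_s - \tfrac{1}{2}\chi_s' \Lambda_s \chi_s
  &= \chi_s' (\bar{V}_s + \Lambda_s \chi_s) - \tfrac{1}{2}\chi_s' \Lambda_s \chi_s \\
  &= \chi_s' \bar{V}_s + \tfrac{1}{2}\chi_s' \Lambda_s \chi_s .
\end{align*}
```

Only the sign of the quadratic term changes, which is the source of the symmetry between $L_t$ and $L_t^{-1}$ exploited later in the chapter.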


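Define the measure $\bar{P}$ by the Radon-Nikodym derivative

$$\frac{d\bar{P}}{dP} = L_T. \qquad (1.5)$$

For convenience, define the mapping $J_t : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ for each t = 1, 2, ... by

$$J_t[v, x] := \exp[x' v - \tfrac{1}{2} x' \Lambda_t x], \qquad t = 1, 2, \ldots \qquad (1.6)$$

so that for t = 1, 2, ...,

$$L_t = \prod_{s=1}^{t} J_s[V_s, \chi_s]. \qquad (1.7)$$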


We shall repeatedly use the following standard result ([13, Prop. 6.1.16]).

Lemma 1.1. Suppose that X and Y are $\mathbb{R}^n$-valued RV's and $\mathcal{G}$ is a sub-$\sigma$-field of $\mathcal{F}$. Suppose that Y is $\mathcal{G}$-measurable and X is independent of $\mathcal{G}$. Then for any bounded Borel mapping $\varphi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{C}$,

$$E[\varphi(X, Y) \mid \mathcal{G}] = E[\varphi(X, y)]\big|_{y = Y} \qquad P\text{-a.s.} \qquad (1.8)$$

The following results are of paramount importance in what follows.

Lemma 1.2. For all t = 1, 2, ... and x in $\mathbb{R}^n$,

$$E[J_t[V_t, x]] = 1 \qquad (1.9)$$

so that

$$E[J_t[V_t, \chi_t] \mid \mathcal{F}_{t-1}] = 1. \qquad (1.10)$$


Proof. Equation (1.9) follows by direct evaluation of the expectation. For each n = 0, 1, ..., Lemma 1.1 implies that

$$E[J_t[V_t, \chi_t] \wedge n \mid \mathcal{F}_{t-1}] = E[J_t[V_t, x] \wedge n \mid \mathcal{F}_{t-1}]\big|_{x = \chi_t} \qquad (1.11)$$

and for all x in $\mathbb{R}^n$, the Monotone Convergence Theorem and (1.9) yield that

$$E[J_t[V_t, x] \wedge n \mid \mathcal{F}_{t-1}] \nearrow 1. \qquad (1.12)$$

Taking the expectation of both sides of (1.11) and using (1.12), we easily verify that $J_t[V_t, \chi_t]$ is integrable. Relation (1.10) is verified by passing to the limit in (1.11) and using (1.12). □
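The "direct evaluation" behind (1.9) can be checked numerically in the scalar case. The sketch below approximates Gaussian expectations with a trapezoidal rule; the variance `lam` (playing the role of $\Lambda_t$), the point `x`, and the grid parameters are assumptions chosen only for this illustration.

```python
import math


def gaussian_expectation(f, var, half_width=12.0, steps=20000):
    """Trapezoidal approximation of E[f(V)] for a scalar V ~ N(0, var)."""
    sd = math.sqrt(var)
    lo = -half_width * sd
    dv = 2.0 * half_width * sd / steps
    total = 0.0
    for k in range(steps + 1):
        v = lo + k * dv
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        density = math.exp(-v * v / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        total += weight * f(v) * density * dv
    return total


def J(v, x, lam):
    """Scalar version of the mapping J_t[v, x] = exp[x v - x^2 lam / 2]."""
    return math.exp(x * v - 0.5 * lam * x * x)


lam, x = 2.0, 0.7
mass = gaussian_expectation(lambda v: J(v, x, lam), lam)      # close to 1
mean = gaussian_expectation(lambda v: v * J(v, x, lam), lam)  # close to lam * x
```

The second quantity shows what the weight does: under the tilted measure the noise acquires mean `lam * x`, which is exactly the shift removed in the definition of the third process in (1.2).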

To proceed with the verification of (B.1), we first show

Proposition 1.1. The process $\{L_t\}_0^\infty$ is an $(\mathcal{F}_t, P)$-martingale.

Proof. By inspecting (1.3), we see that $\{L_t\}_0^\infty$ is $(\mathcal{F}_t)$-adapted. Fix t = 1, 2, ... and assume that $L_{t-1}$ is integrable. If we can show that for any A in $\mathcal{F}_{t-1}$ the relation

$$E[1_A L_{t-1}] = E[1_A L_t] \qquad (1.13)$$

holds, then the conclusion readily follows. Indeed, by setting $A = \Omega$, we may verify that $L_t$ is integrable, so by induction on t and the obvious integrability of $L_0$, $L_t$ will be integrable for each t = 0, 1, .... If (1.13) is true for all A in $\mathcal{F}_{t-1}$, then also

$$E[L_t \mid \mathcal{F}_{t-1}] = L_{t-1} \qquad P\text{-a.s.}, \qquad (1.14)$$

so $\{L_t\}_0^\infty$ will in fact be an $(\mathcal{F}_t)$-martingale.

The proof of (1.13) is straightforward. Using Lemma 1.2 and the fact that $L_{t-1}$ is $\mathcal{F}_{t-1}$-measurable, we have that

$$
\begin{aligned}
E[1_A J_t[V_t, \chi_t](L_{t-1} \wedge n)] &= E[1_A E[J_t[V_t, \chi_t] \mid \mathcal{F}_{t-1}](L_{t-1} \wedge n)] \\
&= E[1_A (L_{t-1} \wedge n)]
\end{aligned}
\qquad (1.15)
$$

for each n = 0, 1, .... Passing to the limit, we verify (1.13) by the Monotone Convergence Theorem. □

Now for each t = 0, 1, ..., T, let $P_t$ (resp. $\bar{P}_t$) be the restriction of P (resp. $\bar{P}$) to $\mathcal{F}_t$; clearly $\bar{P}_t \ll P_t$ for t = 0, 1, ..., T. The following result provides an alternate characterization of the process $\{L_t\}_0^T$.

Proposition 1.2. For t = 0, 1, ..., T,

$$L_t = \frac{d\bar{P}_t}{dP_t}. \qquad (1.16)$$

Proof. We must show that for t = 0, 1, ..., T, the relation

$$\bar{P}_t(A) = \int_A L_t \, dP_t \qquad (1.17)$$

holds for any set A in $\mathcal{F}_t$. But for A in $\mathcal{F}_t$,

$$
\begin{aligned}
\bar{P}_t(A) &= \bar{P}(A) && (1.18) \\
&= \int_A L_T \, dP && (1.19) \\
&= \int_A E[L_T \mid \mathcal{F}_t] \, dP && (1.20) \\
&= \int_A L_t \, dP && (1.21) \\
&= \int_A L_t \, dP_t, && (1.22)
\end{aligned}
$$

so (1.17) is true. □


The verification that $\bar{P}$ has property (B.1) is now trivial.

Proposition 1.3. The measure $\bar{P}$ is a probability measure on $(\Omega, \mathcal{F})$ enjoying property (B.1).

Proof. Note by Proposition 1.2 that

$$\frac{d\bar{P}_0}{dP_0} = L_0 = 1, \qquad (1.23)$$

so $\bar{P}$ and P agree on $\mathcal{F}_0$; in particular, $\bar{P}(\Omega) = P(\Omega) = 1$, so $\bar{P}$ is a probability measure. Since $P\{L_T = 0\} = 0$, $\bar{P}$ and P are mutually absolutely continuous (see [15, Lemma 6.8]). □
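From the mutual absolute continuity of $\bar{P}$ and P and Proposition 1.2, we note that

$$\frac{dP_t}{d\bar{P}_t} = L_t^{-1}, \qquad t = 0, 1, \ldots, T. \qquad (1.24)$$

Let $\bar{E}$ be the expectation operator associated with $\bar{P}$. The following result indicates the structure of $\{L_t^{-1}\}_0^T$ under $\bar{P}$.

Proposition 1.4. The process $\{L_t^{-1}\}_0^T$ is an $(\mathcal{F}_t, \bar{P})$-martingale.

Proof. By inspection of definition (1.3), we see that $\{L_t^{-1}\}_0^T$ is both well-defined and $(\mathcal{F}_t)$-adapted, and from (1.24), $L_t^{-1}$ is automatically $\bar{P}$-integrable. To prove the proposition, it is then sufficient to show that for t = 0, 1, ..., T and A in $\mathcal{F}_t$,

$$\bar{E}[1_A L_t^{-1}] = \bar{E}\Big[1_A \frac{dP}{d\bar{P}}\Big]; \qquad (1.25)$$

then

$$L_t^{-1} = \bar{E}\Big[\frac{dP}{d\bar{P}} \,\Big|\, \mathcal{F}_t\Big], \qquad t = 0, 1, \ldots, T. \qquad (1.26)$$

But (1.25) is trivially true, since

$$\bar{E}[1_A L_t^{-1}] = \bar{E}\Big[1_A \frac{dP_t}{d\bar{P}_t}\Big] = P(A) \qquad (1.27)$$

and

$$\bar{E}\Big[1_A \frac{dP}{d\bar{P}}\Big] = P(A). \qquad (1.28)$$

Hence the proposition holds. □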

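The following result relates the conditional expectation operators under P and $\bar{P}$.

Proposition 1.5. For t = 1, 2, ..., T and any bounded $\mathbb{C}$-valued $\mathcal{F}_t$-measurable RV X,

$$
\begin{aligned}
\bar{E}[X \mid \mathcal{F}_{t-1}] &= \frac{E[X L_t \mid \mathcal{F}_{t-1}]}{L_{t-1}} && (1.29) \\
&= E[X J_t[V_t, \chi_t] \mid \mathcal{F}_{t-1}]. && (1.30)
\end{aligned}
$$

Proof. Since X is bounded, $X L_t$ is clearly P-integrable. By Lemma 1.2, $J_t[V_t, \chi_t]$ is P-integrable, so $X J_t[V_t, \chi_t]$ is P-integrable. To prove (1.29), it is sufficient to show that for any A in $\mathcal{F}_{t-1}$,

$$E[1_A L_{t-1} \bar{E}[X \mid \mathcal{F}_{t-1}]] = E[1_A E[X L_t \mid \mathcal{F}_{t-1}]]. \qquad (1.31)$$

By arguments which should now be clear, we have

$$
\begin{aligned}
E[1_A E[X L_t \mid \mathcal{F}_{t-1}]] &= E[1_A X L_t] && (1.32) \\
&= E\Big[1_A X \frac{d\bar{P}_t}{dP_t}\Big] && (1.33) \\
&= \bar{E}[1_A X], && (1.34)
\end{aligned}
$$

whereas

$$
\begin{aligned}
E[1_A L_{t-1} \bar{E}[X \mid \mathcal{F}_{t-1}]] &= E\Big[1_A \bar{E}[X \mid \mathcal{F}_{t-1}] \frac{d\bar{P}_{t-1}}{dP_{t-1}}\Big] && (1.35) \\
&= \bar{E}[1_A \bar{E}[X \mid \mathcal{F}_{t-1}]] && (1.36) \\
&= \bar{E}[1_A X]; && (1.37)
\end{aligned}
$$

thus (1.31) holds, and the proof is complete. □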

We can now verify

Proposition 1.6. The probability measure $\bar{P}$ enjoys property (B.2).

Proof. It is sufficient to verify that for any t = 1, 2, ... and each $\theta$ in $\mathbb{R}^n$,

$$\bar{E}[\exp[i\theta' \bar{V}_t] \mid \mathcal{F}_{t-1}] = \exp[-\tfrac{1}{2} \theta' \Lambda_t \theta]. \qquad (1.38)$$

Now by Proposition 1.5, we see that

$$\bar{E}[\exp[i\theta' \bar{V}_t] \mid \mathcal{F}_{t-1}] = E[\exp[i\theta' \bar{V}_t] J_t[V_t, \chi_t] \mid \mathcal{F}_{t-1}], \qquad (1.39)$$

where some care must be taken to ensure that the appropriate integrability conditions are satisfied. Lemma 1.2 ensures that $J_t[V_t, \chi_t]$ is P-integrable, and since the complex exponential function has magnitude 1, we see that $\exp[i\theta' \bar{V}_t] J_t[V_t, \chi_t]$ is also P-integrable.

For n = 1, 2, ..., define $T_n : \mathbb{C} \to \mathbb{C}$ by


l n (x )  . — 


G  C  (1.40) 


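Now from the Dominated Convergence Theorem for conditional expectations, we get

$$E[\exp[i\theta' \bar{V}_t] J_t[V_t, \chi_t] \mid \mathcal{F}_{t-1}] = \lim_n E[T_n(\exp[i\theta' \bar{V}_t] J_t[V_t, \chi_t]) \mid \mathcal{F}_{t-1}] \qquad (1.41)$$

and Lemma 1.1 gives

$$E[T_n(\exp[i\theta' \bar{V}_t] J_t[V_t, \chi_t]) \mid \mathcal{F}_{t-1}] = E[T_n(\exp[i\theta' V_t - i\theta' \Lambda_t x] J_t[V_t, x])]\big|_{x = \chi_t}, \qquad n = 1, 2, \ldots \qquad (1.42)$$

For each n = 1, 2, ... and each x in $\mathbb{R}^n$,

$$|T_n(\exp[i\theta' V_t - i\theta' \Lambda_t x] J_t[V_t, x])| \le J_t[V_t, x], \qquad n = 1, 2, \ldots \qquad (1.43)$$

and

$$\lim_n T_n(\exp[i\theta' V_t - i\theta' \Lambda_t x] J_t[V_t, x]) = \exp[i\theta' V_t - i\theta' \Lambda_t x] J_t[V_t, x]. \qquad (1.44)$$

Applying Lemma 1.2 and the Dominated Convergence Theorem, we conclude that

$$
\begin{aligned}
\lim_n E[T_n(\exp[i\theta' V_t - i\theta' \Lambda_t x] J_t[V_t, x])] &= E[\exp[i\theta' V_t - i\theta' \Lambda_t x] J_t[V_t, x]] && (1.45) \\
&= E[\exp[(i\theta + x)' V_t - \tfrac{1}{2} x' \Lambda_t x - i\theta' \Lambda_t x]] && (1.46) \\
&= \exp[-\tfrac{1}{2} \theta' \Lambda_t \theta], && (1.47)
\end{aligned}
$$

where (1.47) holds by direct evaluation of (1.46). Upon combining (1.39), (1.41), (1.42) and (1.47), we verify (1.38). □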
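The evaluation leading from (1.45) to (1.47) can be checked numerically in the scalar case: weighting by J and recentering the noise should reproduce the characteristic function of a zero-mean Gaussian with the original variance. In the sketch below, `lam` plays the role of $\Lambda_t$; the function name and the grid parameters are assumptions of this example.

```python
import cmath
import math


def tilted_char_fn(theta, x, lam, half_width=12.0, steps=20000):
    """Approximate E[exp(i theta (V - lam x)) J(V, x)] for V ~ N(0, lam).

    J(v, x) = exp(x v - lam x^2 / 2) is the scalar form of the weight in (1.6);
    the integral is computed with a trapezoidal rule.
    """
    sd = math.sqrt(lam)
    lo = -half_width * sd
    dv = 2.0 * half_width * sd / steps
    total = 0j
    for k in range(steps + 1):
        v = lo + k * dv
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        density = math.exp(-v * v / (2.0 * lam)) / math.sqrt(2.0 * math.pi * lam)
        j = math.exp(x * v - 0.5 * lam * x * x)
        total += weight * cmath.exp(1j * theta * (v - lam * x)) * j * density * dv
    return total


theta, x, lam = 1.3, 0.7, 2.0
value = tilted_char_fn(theta, x, lam)
target = math.exp(-0.5 * lam * theta * theta)  # right-hand side of (1.38)
```

The computed value agrees with the target for any choice of `x`, reflecting the fact that the right-hand side of (1.38) does not depend on the predictable process.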

The following result, which we shall use in Chapters III and IV, is proved in exactly the same manner as Proposition 1.5 (see [16, Sec. 28.4]).

Proposition 1.7. For t = 1, 2, ..., T and any bounded $\mathbb{C}$-valued $\mathcal{F}_t$-measurable RV X,

$$E[X \mid \mathcal{F}_{t-1}] = \frac{\bar{E}[X L_t^{-1} \mid \mathcal{F}_{t-1}]}{L_{t-1}^{-1}}. \qquad (1.48)$$

Proof. The proof of (1.48) is the same as that of Proposition 1.5 if we reverse the roles of P and $\bar{P}$ and note that $\bar{E}[L_t^{-1} \mid \mathcal{F}_{t-1}] = L_{t-1}^{-1}$, which results from Proposition 1.4. □

The Girsanov transformation presented in [5] is slightly less general than the one presented here. In [5], it is assumed that $\{V_t\}_1^\infty$ is a standard GWN sequence (i.e., $\Lambda_t = I_n$ for t = 1, 2, ...). The case of a general, non-standard, GWN process could have been considered within the framework of [5] by normalizing $\{V_t\}_1^\infty$ to have unit variance, but the approach presented here is more direct.
II.2. The Infinite-Horizon Girsanov Transformation


In this section, we attempt to extend the results of the previous section to the infinite horizon. The notation is the one introduced in the previous section. We seek a probability measure $\bar{P}$ on $(\Omega, \mathcal{F})$ which is mutually absolutely continuous with P and which enjoys the property that

(C.1): the probability measure $\bar{P}$ agrees with P on $\mathcal{F}_0$, and $\{\bar{V}_t\}_1^\infty$ is an $(\mathcal{F}_t, \bar{P})$ zero-mean GWN process with covariance structure $\{\Lambda_t\}_1^\infty$.

Our starting point shall be definition (1.4) and Proposition 1.2. For t = 0, 1, ..., we define a probability measure $\bar{P}_t$ on $(\Omega, \mathcal{F})$ through the Radon-Nikodym derivative

$$\frac{d\bar{P}_t}{dP} = L_t. \qquad (2.1)$$

By Proposition 1.2, we know that $\bar{P}_{t+1}$ and $\bar{P}_t$ agree on $\mathcal{F}_t$ for each t = 0, 1, .... The problem is then to determine if $\{\bar{P}_t\}_0^\infty$ in some sense converges to the sought-after probability measure $\bar{P}$ satisfying (C.1) and which may be supposed to also satisfy (C.2), where

(C.2): for t = 0, 1, ..., $\bar{P}_t$ and $\bar{P}$ agree on $\mathcal{F}_t$.

Note that by the arguments of the previous section, property (C.1) in fact follows from property (C.2). Throughout this section, $\bar{E}$ shall denote the expectation operator associated with the sought-after probability measure $\bar{P}$, and $\bar{E}_t$ shall be the expectation operator associated with $\bar{P}_t$ for t = 0, 1, ....

We shall see that the existence of a probability measure $\bar{P}$ on $(\Omega, \mathcal{F})$ equivalent to P and satisfying properties (C.1) and (C.2) is closely related to the uniform P-integrability of $\{L_t\}_0^\infty$. We shall investigate the ramifications of the uniform P-integrability of $\{L_t\}_0^\infty$ and provide a counter-example to show that $\{L_t\}_0^\infty$ need not be uniformly P-integrable. We shall then provide a sufficient condition for uniform P-integrability which will have a pleasing interpretation in later parts of this thesis. Finally, we shall show that even when $\{L_t\}_0^\infty$ is not uniformly P-integrable, if the filtration $\{\mathcal{F}_t\}_0^\infty$ satisfies a separability condition, then the Daniell-Kolmogorov theorem enables us to construct a probability measure $\bar{P}$ on $(\Omega, \mathcal{F}_\infty)$ satisfying conditions (C.1) and (C.2), but which need not satisfy any absolute continuity conditions with respect to P.
As a first step, from Section 1, we immediately see that $\{L_t\}_0^\infty$ is a nonnegative $(\mathcal{F}_t, P)$-martingale. Thus by well-known results ([6, Cor. 3.17]) there is a nonnegative RV $L_\infty$ such that

$$\lim_t L_t = L_\infty \qquad P\text{-a.s.} \qquad (2.2)$$

and $E[L_\infty] \le 1$ (by Fatou's Lemma).

The following classical result indicates the significance of uniform P-integrability of $\{L_t\}_0^\infty$.

Theorem 2.1. Uniform P-integrability of $\{L_t\}_0^\infty$ is a necessary and sufficient condition for the existence of a probability measure $\bar{P}$ on $(\Omega, \mathcal{F})$ satisfying (C.1) and (C.2) and with $\bar{P} \ll P$. Furthermore, if $\{L_t\}_0^\infty$ is uniformly P-integrable, uniform $\bar{P}$-integrability of $\{L_t^{-1}\}_0^\infty$ is both a necessary and sufficient condition for the equivalence of $\bar{P}$ and P.

Proof. The theorem results from [20, Prop III-1-1] and [20, Prop IV-2-3]. □

We now consider conditions under which we may find a probability measure $\bar{P}$ enjoying properties (C.1) and (C.2) and which is equivalent to P.

Proposition 2.1. Suppose that $\{L_t\}_0^\infty$ is uniformly P-integrable, and let $\bar{P}$ be defined by $d\bar{P}/dP = L_\infty$. A necessary and sufficient condition for $\bar{P}$ to be equivalent to P is that for every $\epsilon > 0$, there exist an $\eta > 0$ such that

$$\sup_t P\{L_t < \eta\} < \epsilon. \qquad (2.3)$$

Proof. Note that for t = 0, 1, ... and c > 0,

$$
\begin{aligned}
\bar{E}[L_t^{-1} 1_{\{L_t^{-1} > c\}}] &= E[L_t L_t^{-1} 1_{\{L_t^{-1} > c\}}] \\
&= P\{L_t^{-1} > c\} \\
&= P\{L_t < 1/c\}.
\end{aligned}
\qquad (2.4)
$$

Consequently, condition (2.3) holds if and only if $\{L_t^{-1}\}_0^\infty$ is uniformly $\bar{P}$-integrable, so the Proposition is verified by invoking Theorem 2.1. □
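For convenience, define the processes $\{N_t\}_0^\infty$, $\{\bar{N}_t\}_0^\infty$ and $\{\langle N \rangle_t\}_0^\infty$ by

$$N_t := \sum_{s=1}^{t} \chi_s' V_s, \qquad N_0 := 0, \qquad t = 0, 1, \ldots \qquad (2.5)$$

$$\bar{N}_t := \sum_{s=1}^{t} \chi_s' \bar{V}_s, \qquad \bar{N}_0 := 0, \qquad t = 0, 1, \ldots \qquad (2.6)$$

$$\langle N \rangle_t := \sum_{s=1}^{t} \chi_s' \Lambda_s \chi_s, \qquad \langle N \rangle_0 := 0, \qquad t = 0, 1, \ldots \qquad (2.7)$$

With this notation,

$$
\begin{aligned}
L_t &= \exp[N_t - \tfrac{1}{2} \langle N \rangle_t] \\
&= \exp[\bar{N}_t + \tfrac{1}{2} \langle N \rangle_t],
\end{aligned}
\qquad t = 0, 1, \ldots \qquad (2.8)
$$

We observe in passing that $\{\langle N \rangle_t\}_0^\infty$ is called the quadratic variation process associated with the square-integrable martingale $\{N_t\}_0^\infty$ and is the unique $(\mathcal{F}_t)$-adapted process such that $\{N_t^2 - \langle N \rangle_t\}_0^\infty$ is a martingale ([20, Chapter 8]).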

Anticipating the Girsanov transformation used in Chapters III and IV, we now assume condition (D), where

(D): for t = 1, 2, ..., the RV $\chi_t$ is $\mathcal{F}_0$-measurable.

If (D) holds and $\bar{P}$ is a probability measure on $(\Omega, \mathcal{F})$ enjoying property (C.1), it is then not difficult to see that the processes $\{\bar{N}_t\}_0^\infty$ and $\{\langle N \rangle_t\}_0^\infty$ have the same joint statistics under $\bar{P}$ as the processes $\{N_t\}_0^\infty$ and $\{\langle N \rangle_t\}_0^\infty$ have under P.

These observations lead to the following result.

Theorem 2.2. Under assumption (D), uniform P-integrability of $\{L_t\}_0^\infty$ is a necessary and sufficient condition for the existence of a probability measure $\bar{P}$ equivalent to P and with properties (C.1) and (C.2).

Proof. Note, from (2.6)-(2.7), that, by symmetry, the $\bar{P}$-statistics of $\{\bar{N}_t, \langle N \rangle_t;\ t = 0, 1, \ldots\}$ are the same as those of $\{-\bar{N}_t, \langle N \rangle_t;\ t = 0, 1, \ldots\}$, so the P-statistics of $\{N_t, \langle N \rangle_t;\ t = 0, 1, \ldots\}$ are the same as the $\bar{P}$-statistics of $\{-\bar{N}_t, \langle N \rangle_t;\ t = 0, 1, \ldots\}$. Uniform integrability being a statistical property, uniform P-integrability of $\{L_t\}_0^\infty$ is equivalent to uniform $\bar{P}$-integrability of $\{L_t^{-1}\}_0^\infty$ by (2.8). The proof is completed with the aid of Theorem 2.1. □

Having seen the significance of uniform P-integrability of $\{L_t\}_0^\infty$, we now provide a counter-example to show that in general, $\{L_t\}_0^\infty$ need not be uniformly P-integrable.

A Counter-example. Let n = 1, and let $\{v_t\}_0^\infty$ be a scalar zero-mean standard $(\mathcal{F}_t, P)$-GWN process (i.e., $\Lambda_t = 1$ for t = 0, 1, ...). Take a in $\mathbb{R}$ with $|a| > 1$, and let $\beta$ be any square-integrable $\mathcal{F}_0$-measurable random variable with $P\{\beta = 0\} < 1$. Set

$$\chi_t = a^t \beta, \qquad t = 0, 1, \ldots \qquad (2.9)$$

We shall show that the martingale

$$
\begin{aligned}
L_t &:= \exp\Big[\beta \sum_{s=0}^{t-1} a^s v_s - \tfrac{1}{2} \beta^2 \sum_{s=0}^{t-1} a^{2s}\Big], \qquad t = 1, 2, \ldots \\
L_0 &:= 1
\end{aligned}
\qquad (2.10)
$$

is not uniformly P-integrable.

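For convenience, define the processes $\{n_t\}_0^\infty$ and $\{[n]_t\}_0^\infty$ by

$$n_t := \sum_{s=0}^{t-1} a^s v_s, \qquad n_0 := 0, \qquad t = 0, 1, \ldots \qquad (2.11)$$

$$[n]_t := \sum_{s=0}^{t-1} a^{2s} = \frac{a^{2t} - 1}{a^2 - 1}, \qquad [n]_0 := 0, \qquad t = 0, 1, \ldots \qquad (2.12)$$

and the collection $\{H_t\}_0^\infty$ of random Borel mappings from $\mathbb{R}$ to $\mathbb{R}$ by

$$H_t(x) := \exp[x n_t - \tfrac{1}{2} x^2 [n]_t], \qquad H_0(x) := 1, \qquad t = 0, 1, \ldots \qquad (2.13)$$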
Then we may write

$$L_t = H_t(\beta), \qquad t = 0, 1, \ldots \qquad (2.14)$$

so that the representation

$$L_\infty = \lim_t H_t(\beta) \qquad (2.15)$$

holds. We shall show that

$$L_\infty = 1_{\{\beta = 0\}}; \qquad (2.16)$$

then $E[L_\infty] = P\{\beta = 0\} < 1$, so $\{L_t\}_0^\infty$ cannot be uniformly P-integrable. Clearly, $\lim_t H_t(0) = 1$. Thus it is sufficient to show that for $x \neq 0$,

$$\lim_t H_t(x) = 0 \qquad P\text{-a.s.}; \qquad (2.17)$$

by standard conditioning arguments, it then follows that

$$
\begin{aligned}
E[L_\infty 1_{\{\beta \neq 0\}}] &= E[\lim_t H_t(\beta) 1_{\{\beta \neq 0\}}] \\
&= 0,
\end{aligned}
\qquad (2.18)
$$

and $L_\infty 1_{\{\beta \neq 0\}} = 0$ P-a.s., so (2.16) is immediate. We shall use the first Borel-Cantelli lemma to verify (2.17).

Fix $x \neq 0$ and t = 1, 2, .... Noting that $n_t$ is a normal RV with zero mean and variance $[n]_t$, we have


where (2.21) follows from (2.20) by the symmetry of an N(0,1)-distributed RV.

Now from a well-known result in the theory of Gaussian RV's (see [9], Appendix 2),

$$\frac{1}{\sqrt{2\pi}} \int_b^\infty \exp(-z^2/2) \, dz \le \frac{1}{2} \exp(-\tfrac{1}{2} b^2) \qquad (2.22)$$

for b > 0, so that


P{xnt  -  ^X2[n]t  <  ~x2\ >JJ  <  ^  exp ---') 

=  i«p(-^2K)' 


(2.23) 


Now $\lim_p p \exp(-p) = 0$, so there clearly exists a B > 0 such that for p > B, $p \exp(-p) < 1$, or, equivalently, $\exp(-p) < 1/p$. Since $[n]_t \nearrow \infty$, there exists an integer T such that for t = T, T + 1, ..., we have $x^2 [n]_t / 8 > B$. Hence


t  =  T,T+  1,...  (2.24) 


But by inspection of (2.12),

$$\sum_{t=1}^{\infty} \frac{1}{[n]_t} < \infty, \qquad (2.25)$$

so that by Borel-Cantelli, we obtain


P{xnt  —  -£2[rc]t  <  —x2[n\t  i.o.  }  =  0. 


(2.26) 


Consequently, 


limsupt  ^xnt  —  ~^x2[n\^j  <  limsup{(-a;2[n]t)  =  — oo  P-a.s.,  (2.27) 


which implies that

$$\lim_t H_t(x) = 0 \qquad P\text{-a.s.} \qquad (2.28)$$

The analysis of the counter-example is complete. □
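The collapse of $L_t$ on the event where $\beta$ is nonzero is easy to observe in simulation. The sketch below tracks $\log L_t$ along one sample path for a fixed value of $\beta$; the indexing of the sums (from 0 to t - 1), the function name, and the parameter values are assumptions of this example.

```python
import random


def log_likelihood_path(a, beta, steps, seed=0):
    """Return [log L_0, ..., log L_steps] for the weights chi_s = a**s * beta.

    log L_t = beta * n_t - beta**2 * [n]_t / 2, where n_t sums a**s * v_s and
    [n]_t sums a**(2 s) over s = 0, ..., t - 1, with v_s i.i.d. N(0, 1).
    """
    rng = random.Random(seed)
    n, bracket = 0.0, 0.0
    path = [0.0]  # log L_0 = 0
    for s in range(steps):
        v = rng.gauss(0.0, 1.0)
        n += (a ** s) * v
        bracket += a ** (2 * s)
        path.append(beta * n - 0.5 * beta * beta * bracket)
    return path


path = log_likelihood_path(1.5, 1.0, 40)
```

With $|a| > 1$ the quadratic term grows like $a^{2t}$ while the martingale term fluctuates only on the scale $a^t$, so `path[-1]` is hugely negative and $L_t$ is numerically zero, while for `beta = 0.0` the path stays at 0. The mean of $L_t$ nevertheless equals 1 for every t, which is exactly the failure of uniform integrability.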

We can now appreciate the following criterion, which ensures that the sequence $\{L_t\}_0^\infty$ is uniformly P-integrable.

Theorem 2.3. Under assumption (D), if

$$E[\langle N \rangle_\infty] < \infty, \qquad (2.29)$$

then $\{L_t\}_0^\infty$ is uniformly P-integrable, so the probability measures $\bar{P}$ and P are equivalent.
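Proof. It is not difficult to see (by first conditioning upon $\mathcal{F}_0$) that for any t = 0, 1, ..., $E[N_t^2] = E[\langle N \rangle_t] \le E[\langle N \rangle_\infty]$. Thus, by the de la Vallée-Poussin uniform integrability condition ([6], Cor. 1.19), $\{|N_t|\}_0^\infty$ is uniformly P-integrable. Since $\{\langle N \rangle_t\}_0^\infty$ is increasing and nonnegative, (2.29) also implies that $\{\langle N \rangle_t\}_0^\infty$ is uniformly P-integrable. Since the sum of two uniformly P-integrable processes is itself uniformly P-integrable, $\{|N_t| + \tfrac{1}{2} \langle N \rangle_t\}_0^\infty$ is uniformly P-integrable. We then know, using [6, Theorem 1.18], that

$$\sup_t E[|N_t| + \tfrac{1}{2} \langle N \rangle_t] < \infty. \qquad (2.30)$$

Take $\epsilon > 0$ and c > 0 such that

$$\ln c > \frac{\sup_t E[|N_t| + \tfrac{1}{2} \langle N \rangle_t]}{\epsilon} \qquad (2.31)$$

and observe that for all t = 0, 1, ...,

$$
\begin{aligned}
E[L_t 1_{\{L_t > c\}}] &= \bar{E}_t[1_{\{L_t > c\}}] \\
&= \bar{P}_t\{L_t > c\} \\
&= \bar{P}_t\{\bar{N}_t + \tfrac{1}{2} \langle N \rangle_t > \ln c\}.
\end{aligned}
\qquad (2.32)
$$

Since $\bar{N}_t$ and $\langle N \rangle_t$ have the same joint statistics under $\bar{P}_t$ as $N_t$ and $\langle N \rangle_t$ have under P, we see that

$$
\begin{aligned}
\bar{P}_t\{\bar{N}_t + \tfrac{1}{2} \langle N \rangle_t > \ln c\} &= P\{N_t + \tfrac{1}{2} \langle N \rangle_t > \ln c\} \\
&\le P\{|N_t| + \tfrac{1}{2} \langle N \rangle_t > \ln c\} \\
&\le \frac{E[|N_t| + \tfrac{1}{2} \langle N \rangle_t]}{\ln c}
\end{aligned}
\qquad (2.33)
$$

by Markov's inequality. From (2.31), we then get that

$$E[L_t 1_{\{L_t > c\}}] < \epsilon \qquad (2.34)$$

for all t = 0, 1, ..., so the process $\{L_t\}_0^\infty$ is uniformly P-integrable. □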
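As a simple illustration of condition (2.29), offered here as a worked example under stated assumptions rather than a result from the text, take the scalar weights of the counter-example but with $|a| < 1$: $\chi_t = a^t \beta$ for t = 1, 2, ..., with $\Lambda_t = 1$ and $\beta$ a square-integrable $\mathcal{F}_0$-measurable RV, so that (D) holds. Then

```latex
\begin{align*}
\langle N \rangle_\infty
  &= \sum_{s=1}^{\infty} \chi_s^2
   = \beta^2 \sum_{s=1}^{\infty} a^{2s}
   = \beta^2 \, \frac{a^2}{1 - a^2}, \\
E[\langle N \rangle_\infty]
  &= E[\beta^2] \, \frac{a^2}{1 - a^2} < \infty .
\end{align*}
```

Condition (2.29) is therefore satisfied when the weights decay geometrically, in contrast with the counter-example, where $|a| > 1$ makes the quadratic variation diverge.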

Finally, we show that if for t = 0, 1, ..., the $\sigma$-field $\mathcal{F}_t$ is separable (i.e., generated by a countable number of RV's), then the Daniell-Kolmogorov theorem enables us to construct a probability measure $\bar{P}$ on $(\Omega, \mathcal{F}_\infty)$ which enjoys properties (C.1) and (C.2),